Saliency-Based Attention Shifting: A Framework for Improving Driver Situational Awareness of Out-of-Label Hazards

Shleibik, Yousra, Sinclair, Jordan, Haring, Kerstin

arXiv.org Artificial Intelligence

The advent of autonomous driving systems promises to transform transportation by enhancing safety, efficiency, and comfort. As these technologies evolve toward higher levels of autonomy, the need for integrated systems that seamlessly support human involvement in decision-making becomes increasingly critical. Certain scenarios necessitate human involvement, including those where the vehicle cannot identify an object or element in the scene and therefore cannot act independently. Situational awareness is thus essential to mitigate potential risks during a takeover, where a driver must assume control from the vehicle. Driver attention is critical for avoiding collisions with external agents and ensuring a smooth transition during takeover operations. This paper explores the integration of attention redirection techniques, such as gaze manipulation through targeted visual and auditory cues, to help drivers maintain focus on emerging hazards and reduce target fixation in semi-autonomous driving scenarios. We propose a conceptual framework that combines real-time gaze tracking, context-aware saliency analysis, and synchronized visual and auditory alerts to enhance situational awareness, proactively address potential hazards, and foster effective collaboration between humans and autonomous systems.
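As a rough illustration of how such a framework might couple gaze tracking with saliency analysis, the sketch below triggers a redirection cue when the tracked gaze point sits far from a sufficiently salient region. All function names, thresholds, and the cue format are hypothetical stand-ins, not taken from the paper:

```python
import numpy as np

def hazard_peak(saliency: np.ndarray) -> tuple:
    """Return the (row, col) of the most salient location in the frame."""
    return np.unravel_index(np.argmax(saliency), saliency.shape)

def should_redirect(gaze_xy, saliency, threshold_px=150, min_saliency=0.6):
    """Return a cue request when the driver's gaze is far from a
    sufficiently salient hazard region, else None.

    gaze_xy is an (x, y) pixel coordinate from the gaze tracker;
    saliency is a 2D map with values in [0, 1].
    """
    peak = hazard_peak(saliency)
    if saliency[peak] < min_saliency:
        return None  # nothing salient enough to warrant interrupting the driver
    # Distance between gaze point and the salient peak (note row/col -> y/x swap)
    dist = np.hypot(gaze_xy[0] - peak[1], gaze_xy[1] - peak[0])
    if dist > threshold_px:
        return {"cue": "visual+audio", "target": (int(peak[1]), int(peak[0]))}
    return None
```

In a real system the saliency map would come from the context-aware analysis stage and the loop would run per frame; the thresholding here only sketches the gaze-to-hazard mismatch test.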


SalM$^{2}$: An Extremely Lightweight Saliency Mamba Model for Real-Time Cognitive Awareness of Driver Attention

Zhao, Chunyu, Mu, Wentao, Zhou, Xian, Liu, Wenbo, Yan, Fei, Deng, Tao

arXiv.org Artificial Intelligence

Driver attention recognition in driving scenarios is a popular direction in traffic scene perception technology. It aims to understand human driver attention to focus on specific targets/objects in the driving scene. However, traffic scenes contain not only a large amount of visual information but also semantic information related to driving tasks. Existing methods lack attention to the actual semantic information present in driving scenes. Additionally, the traffic scene is a complex and dynamic process that requires constant attention to objects related to the current driving task. Existing models, influenced by their foundational frameworks, tend to have large parameter counts and complex structures. Therefore, this paper proposes a real-time saliency Mamba network based on the latest Mamba framework. As shown in Figure 1, our model uses very few parameters (0.08M, only 0.09~11.16% of other models), while maintaining SOTA performance or achieving over 98% of the SOTA model's performance.


Modeling Drivers' Risk Perception via Attention to Improve Driving Assistance

Biswas, Abhijat, Gideon, John, Tamura, Kimimasa, Rosman, Guy

arXiv.org Artificial Intelligence

Advanced Driver Assistance Systems (ADAS) alert drivers during safety-critical scenarios but often provide superfluous alerts because they do not account for drivers' knowledge or scene awareness. Modeling these aspects together in a data-driven way is challenging due to the scarcity of critical-scenario data with in-cabin driver state and world state recorded together. We explore the benefits of driver modeling in the context of Forward Collision Warning (FCW) systems. Working with a real-world video dataset of on-road FCW deployments, we collect observers' subjective validity ratings of the deployed alerts. We also annotate participants' gaze-to-object mappings and semi-automatically extract 3D trajectories of the ego vehicle and other vehicles. We generate a risk estimate of the scene and of the driver's perception in a two-step process: first, we model the movement of vehicles in a given scenario as a joint trajectory forecasting problem; then, we reason about the driver's risk perception of the scene by counterfactually modifying the input to the forecasting model to represent the driver's actual observations of vehicles in the scene. The difference between these behaviours gives us an estimate of driver behaviour that accounts for their actual (inattentive) observations and the downstream effect on overall scene risk. We compare both a learned scene representation and a more traditional "worst-case" deceleration model for the future trajectory forecast. Our experiments show that using this risk formulation to generate FCW alerts may lower the false positive rate of FCWs and improve FCW timing.
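The two-step counterfactual idea can be sketched as follows. The toy risk score, the list-based trajectory format, and all names here are illustrative stand-ins assumed for the sketch, not the paper's learned forecasting model:

```python
import math

def scene_risk(trajectories):
    """Toy risk score: inverse of the minimum pointwise future distance
    between the ego vehicle (index 0) and any other agent's trajectory."""
    ego = trajectories[0]
    min_d = min(
        math.hypot(ex - ox, ey - oy)
        for other in trajectories[1:]
        for (ex, ey), (ox, oy) in zip(ego, other)
    )
    return 1.0 / max(min_d, 1e-6)

def counterfactual_risk_gap(all_tracks, observed_ids, forecast):
    """Risk of the forecast given everything in the scene, minus risk of
    the forecast given only the vehicles the driver actually observed.
    A large positive gap suggests the driver underestimates scene risk."""
    full_risk = scene_risk(forecast(all_tracks))
    # Counterfactual input: keep ego plus only the driver-observed agents
    seen = [t for i, t in enumerate(all_tracks) if i == 0 or i in observed_ids]
    perceived_risk = scene_risk(forecast(seen))
    return full_risk - perceived_risk
```

An FCW policy built on this quantity would alert when the gap is large (high objective risk the driver has not perceived) rather than on objective risk alone, which is what the abstract suggests reduces superfluous alerts.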


STDA: Spatio-Temporal Dual-Encoder Network Incorporating Driver Attention to Predict Driver Behaviors Under Safety-Critical Scenarios

Xu, Dongyang, Luo, Yiran, Lu, Tianle, Wang, Qingfan, Zhou, Qing, Nie, Bingbing

arXiv.org Artificial Intelligence

Accurate behavior prediction for vehicles is essential but challenging for autonomous driving. Most existing studies show satisfactory performance under regular scenarios but neglect safety-critical ones. In this study, a spatio-temporal dual-encoder network named STDA was developed for safety-critical scenarios. Considering human drivers' exceptional situational awareness and risk comprehension, driver attention was incorporated into STDA to facilitate swift identification of critical regions, which is expected to improve both performance and interpretability. STDA contains four parts: the driver attention prediction module, which predicts driver attention; the fusion module, which fuses features from driver attention and raw images; the temporal encoder module, which enhances the capability to interpret dynamic scenes; and the behavior prediction module, which predicts driver behaviors. The experiment data are used to train and validate the model. The results show that STDA improves the G-mean from 0.659 to 0.719 when incorporating driver attention and adopting a temporal encoder module. In addition, extensive experiments validate that the proposed module exhibits robust generalization and can be seamlessly integrated into other mainstream models.
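A minimal numpy sketch of two of these stages is shown below. The attention-weighted fusion and the moving-average temporal encoder are simple stand-ins assumed for illustration; the abstract does not specify STDA's actual learned fusion operator or temporal module:

```python
import numpy as np

def fuse_attention(frame_feats, attn_map, alpha=0.5):
    """Fuse a predicted driver-attention map with per-frame image features
    by upweighting spatial locations the driver would attend to.

    frame_feats: (H, W, C) feature map; attn_map: (H, W) attention map.
    """
    attn = attn_map / (attn_map.max() + 1e-8)        # normalize to [0, 1]
    return frame_feats * (1.0 + alpha * attn[..., None])

def temporal_encode(feat_seq, decay=0.8):
    """Minimal temporal encoder: exponential moving average over frames,
    standing in for a learned recurrent/temporal module."""
    state = feat_seq[0]
    for f in feat_seq[1:]:
        state = decay * state + (1.0 - decay) * f
    return state
```

The fused-then-temporally-encoded features would feed the behavior prediction head; the point of the sketch is only the data flow between the four modules, not the architecture.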


Cognitive Accident Prediction in Driving Scenes: A Multimodality Benchmark

Fang, Jianwu, Li, Lei-Lei, Yang, Kuan, Zheng, Zhedong, Xue, Jianru, Chua, Tat-Seng

arXiv.org Artificial Intelligence

Traffic accident prediction in driving videos aims to provide an early warning of accident occurrence and supports the decision making of safe driving systems. Previous works usually concentrate on the spatial-temporal correlation of object-level context, but they do not fit the inherent long-tailed data distribution well and are vulnerable to severe environmental change. In this work, we propose a Cognitive Accident Prediction (CAP) method that explicitly leverages human-inspired cognition, in the form of text descriptions of the visual observation and of driver attention, to facilitate model training. In particular, the text description provides dense semantic guidance for the primary context of the traffic scene, while the driver attention provides a pull toward the critical regions closely correlated with safe driving. CAP is formulated by an attentive text-to-vision shift fusion module, an attentive scene context transfer module, and a driver attention guided accident prediction module. We leverage the attention mechanism in these modules to explore the core semantic cues for accident prediction. To train CAP, we extend the existing self-collected DADA-2000 dataset (with annotated driver attention for each frame) with factual text descriptions of the visual observations before the accidents. Besides, we construct a new large-scale benchmark consisting of 11,727 in-the-wild accident videos with over 2.19 million frames (named CAP-DATA), together with labeled fact-effect-reason-introspection descriptions and temporal accident frame labels. Extensive experiments validate the superiority of CAP compared with state-of-the-art approaches. The code, CAP-DATA, and all results will be released at \url{https://github.com/JWFanggit/LOTVS-CAP}.


Autonowashing And Waymo's Approach To Fully Autonomous Vehicles

#artificialintelligence

Waymo is a self-driving car company, but they don't particularly like using that terminology. Instead, they prefer "fully autonomous" as a more accurate way to describe driverless driving technology. What consumers may not fully understand is the difference between self-driving and fully autonomous. A self-driving car is a vehicle that provides some level of automation, such as an ADAS (Advanced Driver Assistance System) or automatic cruise control. It still requires driver attention for proper operation, or accidents can result.


DADA-2000: Can Driving Accident be Predicted by Driver Attention? Analyzed by A Benchmark

Fang, Jianwu, Yan, Dingxin, Qiao, Jiahuan, Xue, Jianru, Wang, He, Li, Sen

arXiv.org Artificial Intelligence

Driver attention prediction is becoming a focus of the safe driving research community, as seen in the DR(eye)VE project and the newly emerged Berkeley DeepDrive Attention (BDD-A) database for critical situations. In safe driving, an essential task is to predict incoming accidents as early as possible. BDD-A was aware of this problem and collected driver attention in the laboratory because of the rarity of such scenes. Nevertheless, BDD-A focuses on critical situations that do not involve actual accidents, and addresses only the driver attention prediction task without the closely related step of accident prediction. In contrast, we explore the view of drivers' eyes for capturing multiple kinds of accidents and construct a more diverse and larger video benchmark than ever before, with driver attention and driving accident annotations provided simultaneously (named DADA-2000): 2000 video clips containing about 658,476 frames covering 54 kinds of accidents. These clips are crowd-sourced and captured in various settings (highway, urban, rural, and tunnel), weather (sunny, rainy, and snowy), and light conditions (daytime and nighttime). For the driver attention representation, we collect maps of fixations, saccade scan paths, and focusing time. The accidents are annotated with their categories, the accident window within each clip, and the spatial locations of the crash objects. Based on our analysis, we obtain a quantitative and positive answer to the question posed in the title of this paper.


We need self-driving cars that can monitor our attention along with our emotions

#artificialintelligence

Last month, for the first time ever, a pedestrian was killed by an autonomous vehicle. Elaine Herzberg's death at the hands of a self-driving Uber vehicle in Arizona has spurred a crisis of conscience in the autonomous vehicle industry. Now, engineers and startups are scrambling to shift the focus to technology that they say could help prevent future self-driving collisions, especially as more and more autonomous vehicles are expected to hit the road in the future. One such startup is Renovo Auto, a Silicon Valley company that has developed an operating system that integrates all the software needed to run a fleet of autonomous vehicles. You might remember the Renovo Coupe, a $529,000 electric supercar with 1,000 pound-feet of torque and a 0–60 time of 3.4 seconds, or, more recently, its project to convert a DeLorean with an electric powertrain and then do autonomous donuts with it.


Transport safety body rules safeguards 'were lacking' in deadly Tesla crash

The Guardian

The chairman of the US National Transportation Safety Board (NTSB) said on Tuesday that "operational limitations" in the Tesla Model S played a "major role" in a May 2016 crash that killed a driver using the vehicle's semi-autonomous Autopilot system. Those limitations include Tesla's inability to ensure driver attention even when the car is traveling at high speeds, to restrict Autopilot use to the roads it was designed for, and to monitor driver engagement, the NTSB said. The NTSB recommended that auto safety regulators and automakers take steps to ensure that semi-autonomous systems are not misused. "System safeguards were lacking," NTSB chairman Robert Sumwalt said. "Tesla allowed the driver to use the system outside of the environment for which it was designed and the system gave far too much leeway to the driver to divert his attention."